Abstract

Predicting and scheduling energy consumption in Smart Buildings (SB)
is crucial for deploying Energy-efficient Management Systems. Most
importantly, it is a key aspect of the promising Smart Grid technology,
whereby loads need to be predicted and scheduled in real time to cope
with the strongly coupled variance between energy demand and cost.
Several approaches and models have been adopted for energy consumption
prediction and scheduling. In this paper, we investigated the available
models and opted for machine learning. Namely, we use Artificial Neural
Networks (ANN) along with Genetic Algorithms. We deployed our models in
a real-world SB testbed and used CompactRIO for the ANN implementation.
The proposed models are trained and validated using real-world data
collected from a PV installation along with SB electrical appliances.
Though our model exhibited modest prediction accuracy, which is due to
the small size of the data set, we recommend it as a blueprint for
researchers willing to deploy real-world SB testbeds and investigate
machine learning as a promising avenue for energy consumption
prediction and scheduling.
1 Introduction


Smart Grids (SG) have emerged as a solution to the increasing
worldwide demand for energy. The grid refers to the
traditional electrical grid, a collection of transmission
lines, substations, and other components that make
sure energy is delivered from the power plant to the home
or business [1]. The smartness of the SG resides in the
two-way communication between the utility and its customers,
in addition to the sensing along the lines. The main
components of an SG are controls, computers, and automation,
along with other new technologies that work
together to accommodate the quick increase in
energy demand. The SG has many benefits, among them
more efficient energy transmission, improved
security, and reduced peak demand, which helps
lower electricity rates. SGs are also known for their
use of renewable energy sources.

Prediction and scheduling are two of the main pillars
of efficient Energy Management Systems (EMS). EMSs
are crucial for the proper functioning of the SG. They
are responsible for managing the power flux among the
SG elements in order to minimize costs and optimize
quality [2].

The prediction of the energy consumed by different
appliances is one of the building blocks of the SG concept.
Energy consumption can be seen as a nonlinear
time series influenced by a number of complex factors [3].
With many renewable energy sources used in SGs,
energy prediction methods are becoming more and more
accurate, and hence prediction becomes a crucial
part of the efficient planning of the entire SG.

There are different approaches that are used for the 
prediction of the energy consumption. The most popular 
ones use machine learning (ML). 

Machine learning (ML) is a growing technical field at the
intersection of computer science and statistics.
It tackles the problem of building computers that learn
through experience and hence improve their
algorithms. ML keeps advancing thanks to
new algorithms, the availability of online data,
and the accessibility of computing power
[4]. Artificial Neural Networks (ANN) are among the ML
algorithms that are widely used in this context.

ANNs first appeared in the early 1940s but were not
widely used until recently. They became very popular
thanks to the outstanding results they offer. They are
very powerful with large datasets, which give the neural
network enough data to train the model. In brief, ANNs
are inspired by the way the brain processes information.
They build an information processing model that mimics
the work of the neurons in the brain [5]. Their ability
to learn quickly is what makes ANNs very powerful.
This learning is done through an information flow that
goes in two directions. Patterns from the training dataset
are given to the ANN through the input neurons, then
go through the hidden layers and arrive at the output
neurons. In the opposite direction, the error measured at
the output is propagated back through the network to
adjust the weights.

Genetic Algorithms (GA) are widely regarded as well suited
to task and operation scheduling. They emerged from
the research John Holland conducted at the University
of Michigan in 1960. However, it took them almost
30 years to become popular. The main purpose of GAs is to
solve complex problems for which deterministic algorithms
are considered too expensive. The Travelling Salesman
Problem and the Knapsack problem are cases in point
[6].

ANN and GA models are usually implemented on commodity
computers or, lately, on Raspberry Pis. The NI CompactRIO
is considered a good alternative for deploying
ANN algorithms.

NI CompactRIO is a high-performance embedded controller
with Input/Output modules. It has two targets:
a real-time controller chassis and an FPGA module. It
includes a microprocessor to implement control algorithms
and offers support for a wide range of frequencies.
The FPGA module is mainly used to accommodate the
high speed of certain modules and even certain programs.
It deals with the data streaming from the I/O modules
attached to the CompactRIO. The FPGA is a Xilinx Virtex
device.

The CompactRIO is programmed using a graphical
programming language named LabVIEW, which
allows a better visualization of the data and an intuitive
and easy way to implement control approaches.

In this paper, we train an ANN model to predict
the energy consumed by different appliances in a building.
The model is developed in the Python programming language
but interfaced with LabVIEW for a potential integration in
the NI CompactRIO.

The rest of the paper is organized as follows: Sect. 2
reviews the related literature. Section 3 presents the scope
of the research project under which this work is done.
Section 4 contains the background of our work. In Sect. 5,
we present the implementation steps and the results
obtained, which we discuss in Sect. 6. Then, we conclude
and present our future work in Sect. 7.


2 Literature review 


A lot of work has been carried out in this area by different
researchers in the community.

The authors of [7] present the structure of a home
energy management system that determines the best
day-ahead schedule for the different appliances. This
scheduling is based on hourly prices and peak
power-limiting-based demand response strategies. In addition,
they introduced a realistic test case in order to validate
their schedule. The test showed a significant drop in the
energy consumed by the different appliances thanks to
the schedule they designed.

Sou Kin Cheong et al. presented in [8] a scheduling
method for smart home appliances based on mixed
integer linear programming. They took into
consideration the expected duration and peak power
consumption of the appliances. Based on a previously defined
tariff, the proposed schedule achieved a cost saving of
about 47%. Furthermore, the authors demonstrated that very
good solutions can be obtained using very little
computational power.

Regarding energy consumption prediction, a good
amount of work has been published, representing different
attempts to predict the energy consumed by different
appliances. Ekonomou presented in [9] a prediction
method based on artificial neural networks. In order to
select the best architecture, several multilayer perceptron
models were tested and the one with the best
generalization was selected. Actual input and
output data were used in the training, validation, and
testing process.

The authors of [10] state that building
energy consumption prediction is crucial for efficient
energy planning and management. To perform the prediction,
they present a data-driven model. Their review
shows that the area of energy consumption prediction has
a good number of gaps that require more research:
the prediction of long-term energy consumption,
the prediction of the energy consumed within residential
buildings, and the prediction of the energy consumed by
lighting in buildings. The lack of research in these areas
may be due to the relatively small amount of data that is
available.


3 Project scope 


The work done in this paper falls under the scope of a 
research project named MiGrid. 

The project mainly aims at creating a holistic testbed 
platform that couples smart buildings and renewable 
energy storage and production. The general architecture 
of the project is depicted in Fig. 1 [11].

The testbed has six main components: 


1. Wireless sensors This refers to a set of wirelessly
connected sensors forming a Wireless Sensor Network,
which is responsible for sensing data in a specific
context.

2. Wireless actuators These are wirelessly connected
actuators that translate electrical signals
into mechanical actions to act on appliances.

3. Big data analytics platform This is a platform that takes
care of the processing and storage of data stemming
from the entire SG.

4. NI CompactRIO It is the main controller of the system.
It decides whether to inject the energy produced into
the grid, store it, or use it to power loads.

5. Storage device This consists of a lithium battery that
stores the excess energy and makes it
available for use when needed.

6. Solar parking lot The main renewable energy source in
the system.

Fig. 1 MiGrid general architecture


4 Background 
4.1 Smart Grids 


By definition, an SG is an electrical grid that
integrates a two-way communication system. In traditional
electrical systems, electricity flows in one direction, from
the power plant to the home or business. The SG improves
on this network through on-the-spot feedback about
operations, power interruptions, and electricity
consumption. This feedback is
given back to the power plant and other operators [12].

The SG has the ability to tune itself to provide a better 
state of performance, and a better quality of the energy 
delivered. In addition to that, the SG is capable of antici- 
pating problems and disturbances. Hence, the SG will 
allow for a more efficient transmission of electricity, lower 
kilowatt price, and quicker restoration. 

The SG has three main components that are presented 
in Fig. 2. 


— Physical power assets These consist of the power lines,
transformers, etc. The information given by these assets
is used by the smart meters and wireless devices, in
addition to microgeneration and storage devices.

— Physical communication assets These are mainly the
access and transport network (e.g. fiber) in addition to
switches and routers. The main end users of this
component are home area networks.

— Software and applications This is mainly about
distributed data processing (on-site, off-site, and
virtualized). This includes the management of the grid,
load balancing, IT security, and other roles.

Fig. 2 Smart grid’s components


4.2 Energy management systems 


EMSs are expected to reduce energy consumption by up to
20-30%. They are a set of connected hardware and software
components that have the following tasks:


— Monitoring Collection of energy consumption related 
information in order to establish the basis and clarify 
the changes in targets. 

— Control Putting in place control algorithms in order to 
correct any deviations from the target. 


The new EMS definition is more people-centric
than the older one. It engages users by connecting
their actions with their energy consumption. The
EMS gets users involved by displaying their real-time
energy consumption. Making consumers aware of
their daily energy usage can have a huge impact on their
consumption behavior [13].

EMSs are also well known for managing Microgrids 
whereby they have to process large amounts of data stem- 
ming from different sources. EMSs are supposed to use 
that data to control and monitor the Microgrid. Customers 
also can benefit from the EMS as it can provide them useful 
information inferred from the data generated by sensors 
(e.g. energy consumption, electricity prices, etc.) [14]. 


4.3 Machine learning 


Machine learning is an application of artificial intelligence
(AI). It involves providing systems with the ability to
automatically learn and improve from experience
without being explicitly programmed. This learning process
is based on the exploration of large amounts of data.

It begins by closely observing the data to look for patterns
that would help make better decisions. ML has a
set of algorithms categorized as follows [15]:


— Supervised machine learning algorithms These apply what
was learnt from past labeled data to predict future events.
After sufficient training, the system provides targets for
new inputs. The algorithm can compare its predicted
output to the actual one and correct the model accordingly.

— Unsupervised machine learning algorithms These are used
when the data available to train the model is neither
classified nor labeled. In other words, they infer a hidden
structure or pattern from unlabeled data.

— Semi-supervised machine learning algorithms These
algorithms fall in between the previous two categories,
as they use both labeled and unlabeled data to train
models.


4.4 Time series forecasting methods 


Several methods can be used to forecast time
series data. Classical time series methods are fairly
sophisticated and perform well on a wide range of problems.
In this section, we explore the main existing
methods.


4.4.1 Autoregression (AR) 


The Autoregression model (AR) predicts future
behavior by learning from past behavior. It works well
with data in which each entry is correlated with the
entries that precede it. To create the model, only the past
data of the series itself is used, hence the name
autoregression. The process of creating
the model is a linear regression of the
current data on the past data.

The AR model is a stochastic model, with a certain
degree of uncertainty and randomness associated with it.
This means that the model developed and tested will
never deliver 100% accuracy. It only gets close enough
to be used in prediction scenarios.

AR models are also referred to as conditional models,
Markov models, or transition models.

AR(p) models are autoregressive models in which the p
previous values are used to predict the next value. The p
in AR(p) is called the order; e.g., AR(1) is a regression
model that only uses values that are one period apart.
Similarly, models of second and third order use values
up to two or three periods apart.
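
To make the linear-regression view concrete, the following minimal sketch fits an AR(1) model by ordinary least squares in plain Python. It is an illustration only: the function names `fit_ar1` and `forecast_ar1` and the sample readings are assumptions made for this example and are not taken from our implementation or dataset.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    var = sum((a - mean_prev) ** 2 for a in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_ar1(series, c, phi, steps=1):
    """Iterate the fitted recurrence forward from the last observation."""
    value, out = series[-1], []
    for _ in range(steps):
        value = c + phi * value
        out.append(value)
    return out


# Illustrative readings (made-up numbers that follow y[t] = 2 + 0.5 * y[t-1]).
readings = [10.0, 7.0, 5.5, 4.75, 4.375, 4.1875]
c, phi = fit_ar1(readings)
next_values = forecast_ar1(readings, c, phi, steps=2)
```

A higher order p would regress on the p previous values instead of one, which turns the closed-form slope above into a multiple linear regression.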


4.4.2 Moving average 


A moving average (MA) is a well-known indicator that
helps smooth out a series by reducing the noise from
random fluctuations. It is considered a trend-following
indicator because it is based on past data.

There are two main moving averages that are widely
used: the simple moving average and the exponential
moving average. The simple moving average is the plain
average of a quantity (in finance, the price of a security)
over a defined number of time slots. The exponential
moving average gives higher weights to the most
recent data entries. The
most common applications of the MA are related to trend
direction. This includes determining support and
resistance levels.

Moving averages are useful by themselves, but
they can also be combined with other technical indicators
to deliver better results [16].
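
The two averages can be sketched in a few lines of Python. The function names, the window of 2, and the smoothing factor `alpha = 0.5` are assumptions chosen for illustration; the input values are the first four Fridge readings of Table 1.

```python
def simple_moving_average(values, window):
    """Plain average of each run of `window` consecutive values."""
    return [sum(values[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(values))]


def exponential_moving_average(values, alpha):
    """Recursive EMA: ema[t] = alpha * x[t] + (1 - alpha) * ema[t-1]."""
    ema = [values[0]]
    for x in values[1:]:
        ema.append(alpha * x + (1 - alpha) * ema[-1])
    return ema


fridge = [101.45, 133.70, 38.67, 57.76]  # first four Fridge readings of Table 1
sma = simple_moving_average(fridge, window=2)
ema = exponential_moving_average(fridge, alpha=0.5)
```

A larger `alpha` makes the exponential average react faster to recent readings, while a larger window makes the simple average smoother.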
4.4.3 Autoregressive integrated moving average


Autoregressive integrated moving average (ARIMA)
models are considered the most general class of time
series forecasting models and can be used in conjunction
with nonlinear transformations (e.g. taking logarithms).

There are two main kinds of ARIMA models: seasonal and
non-seasonal.

The main applications of this model are in
short-term forecasting, which requires at least 40 past
data points. It gives better results when the data used
is stable over time with a minimum number of outliers.


4.4.4 Artificial neural networks 


Artificial neural networks (ANN) are widely used in
machine learning in general and in time series
forecasting in particular. ANNs are brain-inspired and
intend to replicate the way humans learn. They
mainly consist of input and output layers in addition to
hidden layers. ANNs are very powerful at finding patterns
that are too complex or too numerous for humans to
extract.

There are multiple types of ANNs; each one is used in
specific scenarios and has a specific degree of complexity.
The basic type is the feedforward neural network,
in which information travels in one direction (from input
to output). Another widely used type is the recurrent
neural network, in which feedback connections let data
flow in more than one direction.
This type is used for complex tasks such as
handwriting, face, and language recognition. Furthermore,
convolutional neural networks, Boltzmann
machine networks, and many others are used to solve
machine learning problems.

ANNs learn in the same way that humans do: from
past experience. They require data to learn, and the
more data they get, the more accurate the prediction.
Data used to train ANNs is usually divided into three
subsets: the training set, which helps establish
the required weights; the validation set, which
helps fine-tune the model; and the test set, used to check
whether the predicted output matches the actual output.
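
As an illustration of the three subsets, the sketch below splits a list of samples in chronological order, which is a common choice for time series. The 70/15/15 proportions and the function name `chronological_split` are assumptions for this example; they are not the proportions used in our experiments.

```python
def chronological_split(samples, train_pct=70, val_pct=15):
    """Split samples into training, validation, and test sets, keeping order."""
    n = len(samples)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


hours = list(range(20))  # stand-in for 20 consecutive hourly samples
train, val, test = chronological_split(hours)
```

Keeping the order means the model is validated and tested on readings that come after the ones it was trained on.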

The main challenge facing ANNs is the time
needed to train the model, in addition to the
computing power required for complex tasks. Furthermore,
ANNs are considered black boxes that are fed
with data to produce a certain output [17].

According to [18], neural networks perform better 
than many forecasting methods. Hence, we are opting 
for ANNs to develop our prediction model for energy 
consumption. 


4.5 Genetic algorithms for tasks scheduling 


A GA is a search method inspired by the theory of
natural evolution. It mimics the natural selection process
by selecting the fittest elements for the production of
the next generation.

The natural selection process starts by identifying the
fittest elements of the initial population. These elements
produce offspring that inherit the characteristics of
their parents and are passed on
to the next generation. If the fitness of the parents is
good, the offspring are likely to be fitter still and hence
have a better chance of surviving. The same process keeps
iterating until a generation with the fittest elements is
formed. This concept can be applied to search problems:
a set of solutions is taken into consideration and the best
solution is selected [19].

The GA goes through five phases: 


— Initial Population Each process starts with a set of
elements (also referred to as individuals) called the
population. Each element of the population is a candidate
solution to the problem.

— Fitness Function It determines how fit each element is
(i.e. its ability to compete with others). The function
assigns a fitness score on which the probability of
selection is based.

— Selection In this phase, the fittest elements are selected,
and their genes are passed to the next generation.

— Crossover This is considered the most significant phase
of the entire process. It consists of randomly choosing
a crossover point within the genes of the parents to be
mated.

— Mutation In the offspring formed, some random genes
can be subject to mutation. This means that some of
the bits in the string can be flipped.


The algorithm keeps executing until the population
converges (i.e. starts producing offspring that are similar
to the previous generation). Then, we can say that the GA
has provided a set of solutions to the problem.

The GA cycle is represented in the loop in Fig. 3. 
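
The five phases can be sketched on a toy problem in which each individual is a bit string and the fitness is the number of ones. This is a minimal illustration under stated assumptions, not the scheduling GA used in Sect. 5: the function name `run_ga`, the parameter values, and the fixed number of generations (used here instead of a convergence test) are all choices made for the example.

```python
import random


def run_ga(fitness, n_bits=12, pop_size=20, generations=40,
           mutation_rate=0.05, seed=0):
    """Evolve bit strings toward higher fitness (fitness must be non-negative)."""
    rng = random.Random(seed)
    # Initial population: random bit strings, each a candidate solution.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Fitness function: score every individual; the tiny offset avoids
        # an all-zero weight vector.
        weights = [fitness(ind) + 1e-9 for ind in population]
        children = []
        while len(children) < pop_size:
            # Selection: parents are drawn with probability proportional
            # to their fitness score.
            mother, father = rng.choices(population, weights=weights, k=2)
            # Crossover: splice the two parents at a random point.
            point = rng.randint(1, n_bits - 1)
            child = mother[:point] + father[point:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        # Remember the fittest individual seen so far.
        best = max(population + [best], key=fitness)
    return best


best = run_ga(fitness=sum)
```

For a scheduling problem, the bit string would be replaced by an encoding of start times and the fitness by a score of the schedule.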


5 Implementation and results 
5.1 Data set 


To develop and train the model, we used a real-world data- 
set available online [20]. 

Some of the dataset’s attributes are shown in the list
below:


— Timestamp the date and time when the energy
consumption was recorded.

— Main_i the energy consumption of the main room.

— Ac the energy consumption of the air conditioner.

— Living_rm the energy consumed by the living room
appliances.

— Fridge the energy consumption of the fridge.

— Microwave the energy consumed by the microwave.

— Furnace the energy consumed by the heaters.


For our model, we used only data about the AC, fridge,
furnace, and microwave. A sample of the data used is
shown in Table 1.

Fig. 3 Genetic algorithm cycle

Table 1 Data set sample

Timestamp         AC       Fridge  Furnace  Microwave
2018-07-30 9:00   1452.20  101.45  0        6.16
2018-07-30 10:00  1153.47  133.70  149.11   37.95
2018-07-30 11:00  1090.39  38.67   145.18   6.15
2018-07-30 12:00  655.75   57.76   87.99    38.93
2018-07-30 13:00  1626.13  63.43   190.76   6.20
2018-07-30 14:00  2138.96  0       233.55   5.45
2018-07-30 15:00  1635.82  94.14   186.96   10.43
2018-07-30 16:00  2145.36  33.52   233.68   6.86


5.2 Artificial neural network model 


To build the prediction model, we opted for a feedforward 
artificial neural network with two input neurons, ten hid- 
den neurons and one output neuron. 

The ANN we are using is depicted in Fig. 4. 

The model takes as input the timestamp at which
the energy consumption was recorded and the
appliance. In our case, we assign a unique ID
to each appliance and then feed it to the neural network.
The appliances’ IDs are shown in Table 2.

Fig. 4 ANN model

Table 2 Appliances’ IDs (columns: ID, Appliance)

Fig. 5 ANN after first iteration

The output layer of the model designed consists only of 
the energy consumed by each appliance. 
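
The encoding described above (a time and an appliance ID as inputs, the energy consumed as output) can be sketched as follows. The snippet is illustrative: it uses two rows of Table 1, takes the appliance IDs as they appear in Table 3, and assumes that the hour of the day is used as the time input. The names `ROWS`, `APPLIANCE_IDS`, and `to_samples` are hypothetical.

```python
# Two rows of Table 1: (timestamp, AC, Fridge, Furnace, Microwave).
ROWS = [
    ("2018-07-30 9:00", 1452.20, 101.45, 0.0, 6.16),
    ("2018-07-30 10:00", 1153.47, 133.70, 149.11, 37.95),
]
# Appliance IDs as listed in Table 3.
APPLIANCE_IDS = {"AC": 1, "Fridge": 2, "Furnace": 3, "Microwave": 4}
COLUMNS = ["AC", "Fridge", "Furnace", "Microwave"]


def to_samples(rows):
    """Flatten wide rows into ([hour, appliance_id], energy) pairs."""
    inputs, targets = [], []
    for timestamp, *readings in rows:
        # Assumption: only the hour of the day is kept from the timestamp.
        hour = int(timestamp.split()[1].split(":")[0])
        for name, energy in zip(COLUMNS, readings):
            inputs.append([hour, APPLIANCE_IDS[name]])
            targets.append(energy)
    return inputs, targets


inputs, targets = to_samples(ROWS)
```

Each wide row of the table thus yields four training samples, one per appliance.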

Our input data is a 10 × 2 matrix (10 being the number
of hidden neurons, and 2 being the number of input
neurons), and the output data is a 10 × 1 matrix (1 being the
number of output neurons).

After the first iteration, our model looks like Fig. 5. 

To calculate the error of our model, we used the mean
sum squared loss as the loss function. The
loss value obtained through the training iterations is
represented in Fig. 6.

The loss is found to be between 2.5% and 36% of the
actual output.

The actual and predicted outputs are plotted in
Fig. 7.

Fig. 6 ANN model loss

Fig. 7 Actual versus predicted output
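
For readers who want to experiment with the same 2-10-1 architecture, the sketch below implements a feedforward network with two inputs, ten hidden neurons, and one output, trained by gradient descent on a squared-error loss. It is not our implementation: the sigmoid activations, the learning rate, the number of epochs, the scaling constants (24, 4, and 2500), and the class name `TinyNetwork` are assumptions made for this example. The four samples are the 9:00 and 10:00 AC and Fridge readings of Table 1.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNetwork:
    """Feedforward network: 2 inputs, 10 sigmoid hidden units, 1 sigmoid output."""

    def __init__(self, n_in=2, n_hidden=10, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return the hidden activations and the network output."""
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train_step(self, x, target, lr):
        """One gradient step on the squared error of a single sample."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        for j, h in enumerate(hidden):
            # Back-propagate through the old output weight before updating it.
            delta_h = delta_out * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * delta_out * h
            self.b1[j] -= lr * delta_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
        self.b2 -= lr * delta_out


def mean_squared_loss(net, inputs, targets):
    errors = [(net.forward(x)[1] - t) ** 2 for x, t in zip(inputs, targets)]
    return sum(errors) / len(errors)


# Scaled samples: [hour / 24, appliance_id / 4] -> energy / 2500.
inputs = [[9 / 24, 1 / 4], [10 / 24, 1 / 4], [9 / 24, 2 / 4], [10 / 24, 2 / 4]]
targets = [1452.20 / 2500, 1153.47 / 2500, 101.45 / 2500, 133.70 / 2500]

net = TinyNetwork()
loss_before = mean_squared_loss(net, inputs, targets)
for _ in range(2000):
    for x, t in zip(inputs, targets):
        net.train_step(x, t, lr=0.5)
loss_after = mean_squared_loss(net, inputs, targets)
```

Scaling the inputs and targets to the unit interval keeps the sigmoid units away from saturation, which is one simple form of preprocessing.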

The study in [21] used artificial neural networks for
the prediction of Greek long-term energy consumption.
For this purpose, the authors investigated the multilayer
perceptron model in order to test the
different available architectures and eventually select the
optimal one. Furthermore, other prediction methods were
used, namely linear regression and the support
vector machine model. These models were compared to
the ANN in terms of accuracy.

The ANN model described showed the highest accuracy
among all the tested prediction methods. The percentage
error of the model was about 2%. This is mainly due
to the high quality of the data used to train and validate
the model. While we used data from an online source, the
authors of that work collected real records, which had
a great impact on the model output.

Fig. 8 Appliances’ schedule

Table 3 Schedule of different appliances

ID  Appliance  Start time  End time
1   AC         2           5
2   Fridge     3           9
3   Furnace    9           19
4   Microwave  19          20


5.2.1 Genetic algorithm model 


The GA model we use takes into consideration the
four appliances mentioned previously in this paper (i.e. AC,
fridge, furnace, and microwave).

Each appliance has a specific start time, working
duration, and consequently a specific end time. These
parameters are described in Fig. 8.

According to Fig. 8, we have four resources
that need to be scheduled daily. By scheduled, we mean
that, for each appliance or resource, the start and end
times need to be determined for each day.

The data fed to the GA is represented in Table 3.

The GA model used is score-based. Each combination
is assigned a score that determines how accurate the
schedule is. Hence the best schedule should have a score
equal or close to 0.

After running the model, we obtain the following
schedule.


— Score—5. 
— Start_times [0, 4, 8, 12, 16]. 
— Appliances [1, 2, 2, 3, 4].


5.2.2 Deployment in CompactRIO using LabVIEW


Programming CompactRIO with LabVIEW is done through 
what is called a Virtual Instrument (VI). It has two compo- 
nents: a front panel and a block diagram. The front panel 
contains the design part of the VI. This means it includes 
buttons, labels, and indicators. The block diagram shows 
the actual wiring of the different components in the 
instrument. 

Since we are integrating Python with LabVIEW, we
follow the architecture in Fig. 9. The VI is to be deployed
later on the CompactRIO controller.

In the architecture presented in Fig. 9, the user
provides the program with input (mainly the time
and the appliance ID for the energy consumption
prediction). As a standalone Python application, the input is
given through the system arguments. After the integration
with the LabVIEW VI, the input is given to the application
through numeric controls implemented in the VI.

The inputs are then passed to the Python scripts through
a LabVIEW library called “exec”.

Once the scripts finish execution, the results are shown 
in a numeric indicator. 

The front panel of the VI used for this purpose is shown 
in Fig. 10. 

In the front panel of the VI, we have two numerical 
controls where the user enters the time and the ID of the 
appliance. In addition to that, we introduced a numerical 
indicator that displays the predicted energy consumed at 
the time entered for a specific appliance. 

The block diagram of the VI is shown in Fig. 11. 

The VI has four main components. We are describing 
them below: 


1. Build path to Python script, which is expected to be in 
the same folder as this VI. The Python script could be 
moved, but this VI would have to reflect that change. 

2. Create command line argument, in the form of
pythonpath scriptpath int32 int32. Quotation marks
are used for the path, in case there are spaces in the
path.

3. Execute the argument. The run is minimized, so that
the command window does not quickly appear and
disappear.

4. Convert the string output from Python to an int.

Fig. 9 Program architecture (a LabVIEW VI, with controls and
indicators, calls the Python scripts ANN.py and GA.py and is
deployed on the NI CompactRIO)
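
On the Python side, the steps above imply a script that reads two integers from the command line and prints a single integer for the VI to parse. A minimal sketch of such an entry point is given below. The function `predict_energy` is a hypothetical placeholder that returns a dummy value in place of the trained ANN, and the names `run` and `predict_energy` are assumptions for this example.

```python
import sys


def predict_energy(hour, appliance_id):
    """Hypothetical stand-in for the trained ANN; returns a dummy value."""
    return 100.0 * appliance_id + hour


def run(args):
    """Parse [hour, appliance_id] strings and return the prediction as text."""
    hour, appliance_id = int(args[0]), int(args[1])
    # Round to an integer so the caller can convert the output to an int.
    return str(int(round(predict_energy(hour, appliance_id))))


# Invoked as: pythonpath scriptpath int32 int32
if __name__ == "__main__" and len(sys.argv) == 3:
    print(run(sys.argv[1:]))
```

Printing a single integer on standard output keeps the string-to-int conversion in step 4 trivial.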


6 Discussion 


As the results show, the models did
not achieve a very high accuracy compared to other existing
models. The loss of our ANN model is relatively high, which
makes it an unreliable method for energy consumption
prediction. This is mainly due to the quality of the data
used. Our data did not go through any preprocessing.

Also, as ANNs need to learn from the data provided, our
dataset did not help the ANN to learn, and hence the final
model was poor.

There is a huge ongoing debate among practitioners
on whether to opt for ANNs or statistical
methods for prediction. The article in [22] contains
a study that compares ANNs to regression with medium
to large datasets of patients (size > 200). According to the
review, ANN outperformed the regression method in
36% of the cases and was outperformed in 14% of them.
However, as the dataset size increased (size > 5000),
regression showed a better performance.

Fig. 11 Block diagram of the VI

The authors of [23] conducted a comparative study
between ANNs and statistical methods in the prediction
of consumers' behavior. They found that ANNs
consistently outperform the other methods. Due to
the flexibility of the models, they were able to perform
better with both known and unknown choice rules. The
ANN model used in this study is iterative, which implies
that the model learns complex attributes about consumers.

ANNs differ from statistical methods in
that they do not assume any particular relationship
between the input and output variables. However, model
overfitting can be a real problem when designing an
ANN model. Hence, one needs to make sure that the data
fed to the model is preprocessed.


7 Conclusion and future work 


In this paper, we delineated the details of deploying a 
machine-learning-based model for energy consump- 
tion prediction and scheduling in Smart Buildings. The 
presented model can serve as a roadmap for deploying 
real-world SB testbeds, and thus paving the way towards 
grounded research around Smart Buildings for Smart 
Grids. 

We integrated the models, written in Python, into a LabVIEW
program by creating a VI that allows the user to
select an appliance, along with the time of the day, and
then computes the predicted consumption using our
ANN-based model. The models did not show very high
accuracy, as the dataset used was not big enough
for training and validation. When analyzing
similar models in the literature that use ANNs for prediction,
we came to the conclusion that ANNs are not always the
best choice and that they can be outperformed by basic
statistical methods, regression being a case in point.
Nevertheless, this remains a worthwhile avenue to
investigate for researchers interested in machine learning,
as long as they have access to good data, ideally in large
quantities.

As future work, we intend to collect further data from
our testbed, and obtain some relevant big data as well, in
order to tailor better models. In our testbed, we intend
to gather PV production data, the energy
consumption of different appliances, and context data (e.g.
motion, temperature, humidity). Last but not least, we
will deepen our investigation of existing prediction
methods, notably by incorporating statistical approaches
as well.